
[QNN EP] Fix 16x16 MatMul translation #24846


Merged
merged 2 commits into microsoft:main from dev/tirupath/matmul_16x16 on Jun 5, 2025

Conversation

quic-tirupath
Contributor

Description

  • QNN's 16x16 FC doesn't support an asymmetric int16 weight.
  • QNN's 16x16 MatMul doesn't support an asymmetric int16 weight initializer.
  • Insert a Convert op to convert the asymmetric uint16 weight to a symmetric int16 weight (a conceptual sketch follows this list).
  • Add unit tests to verify 16x16 MatMul translations.
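
As a conceptual illustration of that Convert step: under the assumption that the scale is carried over unchanged and only the zero point is folded into the data, an asymmetric uint16 weight maps onto a symmetric int16 encoding as sketched below. The function name and standalone form are hypothetical; the actual change inserts a QNN Convert op into the graph rather than rewriting weight data inside the EP.

```cpp
// Conceptual sketch only (not the QNN EP implementation): requantize an
// asymmetric uint16 weight into a symmetric int16 encoding (zero point 0),
// keeping the original scale. Names are illustrative.
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

std::vector<int16_t> RequantizeUint16ToSymmetricInt16(
    const std::vector<uint16_t>& q_u16,  // asymmetric uint16 quantized data
    uint16_t zero_point) {               // asymmetric zero point
  // Asymmetric uint16 represents r = scale * (q - zero_point).
  // With the same scale, symmetric int16 represents r = scale * q_s,
  // so q_s = q - zero_point, clamped to the symmetric range [-32767, 32767].
  std::vector<int16_t> q_s16(q_u16.size());
  for (std::size_t i = 0; i < q_u16.size(); ++i) {
    const int32_t shifted =
        static_cast<int32_t>(q_u16[i]) - static_cast<int32_t>(zero_point);
    q_s16[i] = static_cast<int16_t>(std::clamp<int32_t>(shifted, -32767, 32767));
  }
  return q_s16;
}
```

With a mid-range zero point only the extreme values saturate under this fixed-scale mapping; a heavily skewed zero point would clamp more values, so a real Convert op may requantize with its own output scale instead.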

Motivation and Context

  • This fix schedules 16x16 MatMul ops on the QNN HTP accelerator.
  • This improves inference time for models that contain 16x16 MatMul operators.

@HectorSVC HectorSVC added the ep:QNN issues related to QNN execution provider label May 23, 2025
@HectorSVC
Contributor

/azp run Linux QNN CI Pipeline,Win_TRT_Minimal_CUDA_Test_CI,Windows ARM64 QNN CI Pipeline,Windows GPU Doc Gen CI Pipeline,Windows x64 QNN CI Pipeline


Azure Pipelines successfully started running 5 pipeline(s).

@quic-tirupath
Contributor Author

Analysis of failures reported in the CI pipelines:

  • The Linux QNN CI Pipeline is failing at the newly added unit test. I can verify the unit test passes on my local setup. It seems the SoC used for Linux QNN CI pipeline verification is very old and does not support INT16. We will keep the unit test enabled only for Windows toolchains until the SoC used for Linux QNN CI pipeline testing is upgraded.

  • The Windows QNN CI pipeline failures appear to be intermittent and unrelated to the code changes in this PR; they may not show up on a re-run.

@HectorSVC
Contributor

> Analysis of failures reported in the CI pipelines:
>
>   • The Linux QNN CI Pipeline is failing at the newly added unit test. I can verify the unit test passes on my local setup. It seems the SoC used for Linux QNN CI pipeline verification is very old and does not support INT16. We will keep the unit test enabled only for Windows toolchains until the SoC used for Linux QNN CI pipeline testing is upgraded.
>   • The Windows QNN CI pipeline failures appear to be intermittent and unrelated to the code changes in this PR; they may not show up on a re-run.

We only enabled HTP tests in the Linux environment, and they run on the QNN simulator, not a real device.

@HectorSVC
Contributor

/azp run Linux QNN CI Pipeline,Win_TRT_Minimal_CUDA_Test_CI,Windows ARM64 QNN CI Pipeline,Windows GPU Doc Gen CI Pipeline,Windows x64 QNN CI Pipeline


Azure Pipelines successfully started running 5 pipeline(s).

@quic-tirupath
Contributor Author

@HectorSVC
This commit is entirely QNN EP related. It seems the Web CI pipelines ran for a long time and were eventually canceled. Could you please look into the Web CI pipeline problems and unblock this commit?

Thanks in advance.

@HectorSVC
Contributor

There was a fix for the Web CI pipeline; please merge the latest main branch into your branch.

 - QNN's 16x16 FC doesn't support an asymmetric int16 weight
 - QNN's 16x16 MatMul doesn't support an asymmetric int16 weight
   initializer
 - Insert a Convert op to convert the asymmetric uint16 weight
   to a symmetric int16 weight
 - Add unit tests to verify 16x16 MatMul translations
 - MatMul 16x16 is supported only on some hardware
 - Disable MatMul 16x16 unit tests on Linux platforms
@quic-tirupath quic-tirupath force-pushed the dev/tirupath/matmul_16x16 branch from aae6e64 to 698f087 on June 2, 2025 18:23
@quic-tirupath
Contributor Author

> There was a fix for the Web CI pipeline; please merge the latest main branch into your branch.

@HectorSVC Thanks for the input. I merged the latest main branch. Could you please re-trigger the CI pipelines?

@HectorSVC
Contributor

/azp run Linux QNN CI Pipeline,Win_TRT_Minimal_CUDA_Test_CI,Windows ARM64 QNN CI Pipeline,Windows GPU Doc Gen CI Pipeline,Windows x64 QNN CI Pipeline


Azure Pipelines successfully started running 5 pipeline(s).

@HectorSVC HectorSVC merged commit 46caf47 into microsoft:main Jun 5, 2025
82 checks passed
javier-intel pushed a commit to intel/onnxruntime that referenced this pull request Jun 15, 2025
### Description
- QNN's 16x16 FC doesn't support an asymmetric int16 weight.
- QNN's 16x16 MatMul doesn't support an asymmetric int16 weight initializer.
- Insert a Convert op to convert the asymmetric uint16 weight to a symmetric int16 weight.
- Add unit tests to verify 16x16 MatMul translations.

### Motivation and Context
- This fix schedules 16x16 MatMul ops on the QNN HTP accelerator.
- This improves inference time for models that contain 16x16 MatMul operators.
Labels
ep:QNN issues related to QNN execution provider
2 participants